In [22]:
import pandas as pd
import seaborn as sns
import plotly.express as px

import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
In [6]:
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"

Matplotlib¶

For this exercise, we have written the following code to load the stock dataset built into plotly express.

In [7]:
stocks = px.data.stocks()
stocks.head()
Out[7]:
date GOOG AAPL AMZN FB NFLX MSFT
0 2018-01-01 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
1 2018-01-08 1.018172 1.011943 1.061881 0.959968 1.053526 1.015988
2 2018-01-15 1.032008 1.019771 1.053240 0.970243 1.049860 1.020524
3 2018-01-22 1.066783 0.980057 1.140676 1.016858 1.307681 1.066561
4 2018-01-29 1.008773 0.917143 1.163374 1.018357 1.273537 1.040708

Question 1:¶

Select a stock and create a suitable plot for it. Make sure the plot is readable and shows the relevant information, such as dates and values.
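One way to keep the date axis readable is to parse the dates and let matplotlib place the ticks. The sketch below is only an illustration: it uses the five GOOG rows shown in the stocks.head() output above, and the weekly locator and the date format are assumptions that would need adjusting for the full series.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# The five GOOG rows from the stocks.head() output above
dates = pd.to_datetime(
    ["2018-01-01", "2018-01-08", "2018-01-15", "2018-01-22", "2018-01-29"]
)
goog = [1.000000, 1.018172, 1.032008, 1.066783, 1.008773]

fig, ax = plt.subplots()
ax.plot(dates, goog)
ax.set_title("Google stock")
ax.set_xlabel("Date")
ax.set_ylabel("Stock value")

# Assumed tick policy: one tick per Monday, printed as YYYY-MM-DD
ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
```

For the full two-year series a coarser locator such as mdates.MonthLocator is probably a better fit than weekly ticks.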

In [28]:
fig, ax = plt.subplots()
ax.plot(stocks.loc[:,'date'],stocks.loc[:,'GOOG'])
# set title
ax.set_title('Google stock')
# horizontal axis
ax.set_xlabel('Date')
ax.set_xticks(stocks.loc[::10,'date'])
# vertical axis
ax.set_ylabel('Stock value')
fig.set_size_inches(18.5, 10.5)
plt.show()

Question 2:¶

You've already plotted data from one stock. It is also possible to plot several stocks in one figure to support comparison.
To distinguish the lines, customise line styles, markers and colors, and add a legend to the plot.
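As a rough sketch of the customisation asked for here, each line can be given its own color, line style and marker through keyword arguments to ax.plot. The data are the first five rows of two columns from the stocks.head() output above; the particular style choices are arbitrary assumptions.

```python
import matplotlib.pyplot as plt

# First five rows of two columns, copied from the stocks.head() output above
dates = ["2018-01-01", "2018-01-08", "2018-01-15", "2018-01-22", "2018-01-29"]
series = {
    "GOOG": [1.000000, 1.018172, 1.032008, 1.066783, 1.008773],
    "AAPL": [1.000000, 1.011943, 1.019771, 0.980057, 0.917143],
}

# Hypothetical style choices: one color, line style and marker per stock
styles = {
    "GOOG": {"color": "tab:blue", "linestyle": "-", "marker": "o"},
    "AAPL": {"color": "tab:red", "linestyle": "--", "marker": "s"},
}

fig, ax = plt.subplots()
for name, values in series.items():
    ax.plot(dates, values, label=name, **styles[name])

ax.set_title("Stocks")
ax.set_xlabel("Date")
ax.set_ylabel("Stock value")
ax.legend()
```

Keeping the styles in a dictionary keyed by column name makes it easy to extend the same loop to all six stocks.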

In [29]:
fig, ax = plt.subplots()
# plot every stock column against the date, one labelled line per stock
for name in ['GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT']:
    ax.plot(stocks.loc[:,'date'], stocks.loc[:,name], label=name)
# set title
ax.set_title('Stocks')
# horizontal axis
ax.set_xlabel('Date')
ax.set_xticks(stocks.loc[::10,'date'])
# vertical axis
ax.set_ylabel('Stock value')
ax.legend()
fig.set_size_inches(18.5, 10.5)
plt.show()

Seaborn¶

First, load the tips dataset.

In [30]:
tips = sns.load_dataset('tips')
tips.head()
Out[30]:
total_bill tip sex smoker day time size
0 16.99 1.01 Female No Sun Dinner 2
1 10.34 1.66 Male No Sun Dinner 3
2 21.01 3.50 Male No Sun Dinner 3
3 23.68 3.31 Male No Sun Dinner 2
4 24.59 3.61 Female No Sun Dinner 4

Question 3:¶

Let's explore this dataset. Pose a question and create a plot that helps you answer it.

Some possible questions:

  • Are there differences between male and female customers when it comes to giving tips?
  • Which attribute correlates most strongly with tip?
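For the correlation question, a first numeric check could look like the sketch below. It uses only the five rows shown in the tips.head() output above, which is far too few to draw conclusions from; in the notebook the same expression would be applied to the numeric columns of the full tips dataframe.

```python
import pandas as pd

# Numeric columns of the five rows shown in tips.head() above
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "size": [2, 3, 3, 2, 4],
})

# Pearson correlation of every numeric column with 'tip', strongest first
corr_with_tip = sample.corr()["tip"].drop("tip").sort_values(ascending=False)
print(corr_with_tip)
```

A scatter plot of the strongest candidate against tip is a natural follow-up, since a single correlation coefficient can hide non-linear patterns.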

My question¶

Is there a relation between the day of the week and the total amount of tips received that day? If so, which day is the most profitable?
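Before plotting, the question can be checked numerically by summing the tips per day. The helper name total_tips_by_day is hypothetical, and the sample holds only the five Sunday rows from the tips.head() output above, so it yields a single group; with the full dataset one would call total_tips_by_day(tips).

```python
import pandas as pd

def total_tips_by_day(df: pd.DataFrame) -> pd.Series:
    """Sum the 'tip' column for each 'day', largest total first."""
    # observed=True skips unused categories when 'day' is a categorical column
    return df.groupby("day", observed=True)["tip"].sum().sort_values(ascending=False)

# Hypothetical mini-sample: the five rows of tips.head(), all from Sunday
sample = pd.DataFrame({
    "day": ["Sun"] * 5,
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
})

totals = total_tips_by_day(sample)
print(totals)
```

The first entry of the returned series is the day with the highest tip total, which answers the second half of the question directly.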

In [31]:
# YOUR CODE HERE
g = sns.FacetGrid(tips, col='day')
g.map(sns.scatterplot, 'total_bill', 'tip')
g.add_legend()

fig = px.bar(tips, x="tip", y="day", orientation='h')
plt.show()
fig.show()

Plotly Express¶

Question 4:¶

Redo the above exercises (Questions 2 & 3) with plotly express. Create diagrams that you can interact with.

The stocks dataset¶

Hints:

  • Turn the stocks dataframe into a structure that can be picked up easily by plotly express
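One possible reading of this hint is to reshape the wide table (one column per stock) into long form (one row per date and stock) with DataFrame.melt. The sketch below uses three rows of two columns from the stocks.head() output above; the column names company and value are assumptions chosen for illustration.

```python
import pandas as pd

# Three rows of two stock columns from the stocks.head() output above
wide = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-08", "2018-01-15"],
    "GOOG": [1.000000, 1.018172, 1.032008],
    "AAPL": [1.000000, 1.011943, 1.019771],
})

# Long form: one row per (date, company) pair
long = wide.melt(id_vars="date", var_name="company", value_name="value")
print(long)
```

With the data in this shape, a call such as px.line(long, x="date", y="value", color="company") draws one line per stock.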
In [32]:
# YOUR CODE HERE
# plotly express accepts wide-form data: pass the column names and it draws one line per stock
fig = px.line(stocks, x='date', y=['GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT'])
fig.show()

The tips dataset¶

In [33]:
fig1=px.scatter(tips, x="total_bill", y="tip", color="day")
fig1.show()

fig2 = px.bar(tips, x="tip", y="day", orientation='h')

fig2.show()

Question 5:¶

Recreate the barplot below that shows the population of different continents for the year 2007.

Hints:

  • Extract the 2007 year data from the dataframe. You have to process the data accordingly
  • Use the plotly express bar chart
  • Add different colors for different continents
  • Sort the continents in the visualisation using the axis layout settings
  • Add text to each bar that represents the population
In [51]:
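# load data
df = px.data.gapminder()
# keep only the rows for 2007
df = df.loc[df['year'] == 2007]
# total population per continent; summing only 'pop' avoids aggregating the text columns
df = df.groupby('continent', as_index=False)['pop'].sum()
# color by continent so every bar gets its own color and a meaningful legend entry
fig1 = px.bar(df, x="pop", y="continent", orientation='h', color="continent", text_auto='.2s')
# order the bars by total population
fig1.update_layout(yaxis={'categoryorder':'total descending'})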

fig1.show()